Skip to content

fix(webhook): merge dataset annotations in multi-round FUSE injection#5974

Merged
cheyang merged 2 commits into
fluid-cloudnative:masterfrom
Sukuna0007Abhi:master
Jun 15, 2026
Merged

fix(webhook): merge dataset annotations in multi-round FUSE injection#5974
cheyang merged 2 commits into
fluid-cloudnative:masterfrom
Sukuna0007Abhi:master

Conversation

@Sukuna0007Abhi

Copy link
Copy Markdown
Contributor

Ⅰ. Describe what this PR does

This PR fixes an edge case in the DatasetUsageInjector webhook where it was completely overwriting the fluid.io/dataset-in-use annotation instead of merging it.

I noticed there was a // TODO comment left in the codebase mentioning that multi-round sidecar injection wasn't fully supported yet. So, if a Pod got mutated in multiple rounds (for example, injecting different datasets sequentially), the webhook would wipe out the datasets from earlier rounds and only keep the last one.

To fix this, I updated the logic to check if the annotation already exists. If it does, we parse the existing comma-separated string, merge it with the new datasets using a Go map to avoid duplicates, sort them, and safely update the annotation.

Ⅱ. Does this pull request fix one issue?

fixes #5973

Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

I updated the Ginkgo unit tests in pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector_test.go. Specifically, I modified the Context("when pod has different annotation value") block to explicitly verify that the old dataset and the new dataset are correctly merged into a single, sorted comma-separated string, rather than being overwritten.

Ⅳ. Describe how to verify it

You can verify this by running the standard webhook test suite: go test -v ./pkg/webhook/plugins/datasetusageinjector/...

To manually verify, you can create a Pod that requests multiple datasets which trigger sequential mutations, and then check the Pod's annotations using kubectl get pod -o yaml. The fluid.io/dataset-in-use annotation will now contain all the injected datasets.

Ⅴ. Special notes for reviews

The string manipulation and deduplication logic was kept as lightweight as possible to avoid adding any overhead to the admission controller's critical path.

Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>
@fluid-e2e-bot

fluid-e2e-bot Bot commented Jun 10, 2026

Copy link
Copy Markdown

Hi @Sukuna0007Abhi. Thanks for your PR.

I'm waiting for a fluid-cloudnative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the DatasetUsageInjector to support multi-round injection by merging existing dataset usage annotations with new ones instead of overwriting them. The review feedback points out a potential issue where malformed or empty values in existing annotations could lead to empty strings in the final annotation list, and suggests filtering out empty strings after trimming.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go
@Sukuna0007Abhi

Copy link
Copy Markdown
Contributor Author

pls review @cheyang @TrafalgarZZZ @xliuqq

@cheyang cheyang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up, @Sukuna0007Abhi — nice first contribution!

The approach looks solid: reading the existing annotation, deduplicating via a map, and writing back the sorted union. A few observations:

What looks good:

  • Nil-map access on pod.Annotations is safe in Go (returns zero value), so the read path is correct even when annotations haven't been initialized yet.
  • The len(existingVal) > 0 guard prevents splitting an empty string into [""].
  • Deduplication + sorted output ensures idempotent behavior across rounds — nice.
  • The TODO comment removal is appropriate now that the issue is addressed.

Minor nit (non-blocking):
When parsing the existing annotation value, strings.TrimSpace(ds) could produce an empty string if the stored value is somehow malformed (e.g., trailing comma: "a,b,"). Adding a quick filter would make it fully defensive:

for _, ds := range existingDatasets {
    trimmed := strings.TrimSpace(ds)
    if trimmed != "" {
        datasetsInUseMap[trimmed] = struct{}{}
    }
}

In practice this is unlikely since this code is the only writer of the annotation, but it's a cheap guard. Up to you whether to address it — won't block on this.

Tests: The updated test correctly validates the merge scenario. If you're feeling ambitious, a test for duplicate detection (same dataset already in annotation + in current runtimeInfos) would round things out, but what's here is adequate.

Overall this is a clean, well-scoped fix. LGTM from a code-correctness standpoint.

/ok-to-test

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.57%. Comparing base (dae5534) to head (7931adb).
⚠️ Report is 10 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5974      +/-   ##
==========================================
+ Coverage   63.54%   63.57%   +0.03%     
==========================================
  Files         478      479       +1     
  Lines       33217    33290      +73     
==========================================
+ Hits        21107    21165      +58     
- Misses      10432    10445      +13     
- Partials     1678     1680       +2     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Sukuna0007Abhi

Copy link
Copy Markdown
Contributor Author

Thanks for picking this up, @Sukuna0007Abhi — nice first contribution!

The approach looks solid: reading the existing annotation, deduplicating via a map, and writing back the sorted union. A few observations:

What looks good:

  • Nil-map access on pod.Annotations is safe in Go (returns zero value), so the read path is correct even when annotations haven't been initialized yet.
  • The len(existingVal) > 0 guard prevents splitting an empty string into [""].
  • Deduplication + sorted output ensures idempotent behavior across rounds — nice.
  • The TODO comment removal is appropriate now that the issue is addressed.

Minor nit (non-blocking): When parsing the existing annotation value, strings.TrimSpace(ds) could produce an empty string if the stored value is somehow malformed (e.g., trailing comma: "a,b,"). Adding a quick filter would make it fully defensive:

for _, ds := range existingDatasets {
    trimmed := strings.TrimSpace(ds)
    if trimmed != "" {
        datasetsInUseMap[trimmed] = struct{}{}
    }
}

In practice this is unlikely since this code is the only writer of the annotation, but it's a cheap guard. Up to you whether to address it — won't block on this.

Tests: The updated test correctly validates the merge scenario. If you're feeling ambitious, a test for duplicate detection (same dataset already in annotation + in current runtimeInfos) would round things out, but what's here is adequate.

Overall this is a clean, well-scoped fix. LGTM from a code-correctness standpoint.

/ok-to-test

Thanks for the review, thanks for approving and merging 🙌

@Sukuna0007Abhi

Copy link
Copy Markdown
Contributor Author

Pls approve @cheyang @TrafalgarZZZ

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes the DatasetUsageInjector mutating webhook so multi-round FUSE sidecar injection merges into the existing fluid.io/datasets-in-use annotation instead of overwriting it, preserving dataset usage tracking across sequential mutations.

Changes:

  • Merge existing datasets-in-use annotation value with newly injected dataset names via a deduplicating map and sorted output.
  • Remove the prior TODO by implementing the previously missing multi-round behavior.
  • Update the Ginkgo test to assert merge behavior rather than overwrite behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go Implements merge/dedupe/sort when writing the datasets-in-use annotation.
pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector_test.go Updates unit test expectations to validate merged annotation output.
Comments suppressed due to low confidence (1)

pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector_test.go:176

  • The updated test only covers merging when the existing annotation contains a single dataset. Since the new logic also parses comma-separated values, trims whitespace, and deduplicates via a map, add coverage for (1) existing annotation with multiple entries / whitespace and (2) a duplicate dataset across rounds to ensure the merge logic behaves as intended.
			It("should merge the existing annotation with the new one", func() {
				pod := &corev1.Pod{
					ObjectMeta: metav1.ObjectMeta{
						Name:      "test-update",
						Namespace: "fluid-test",

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go
@cheyang

cheyang commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator

Summary: Clean, well-scoped fix that correctly addresses the multi-round FUSE injection annotation overwrite bug.

The approach is sound: read existing annotation values into a map for deduplication, merge new datasets, sort, and write back. Key observations:

  • Nil-safety is handled correctly. Reading from a nil pod.Annotations map in Go returns the zero value, and the len(pod.Annotations) == 0 guard before writing ensures no nil-map panic.
  • The len(existingVal) > 0 check correctly avoids splitting an empty string into [""].
  • Sorted, deduplicated output ensures idempotent behavior across injection rounds.
  • TODO comment removal is appropriate now that the issue is resolved.

Non-blocking suggestion: Filter out empty strings after TrimSpace to guard against malformed annotation values (trailing commas, etc.). This is defensive since this code is the sole writer, but it is a one-line addition.

Test coverage note: The updated test validates the merge scenario well. A test for partial-overlap deduplication (annotation has datasets A and B, runtimeInfos has B and C, expect A,B,C) would round things out but is not required.

No blocking issues found. The fix is correct and ready for merge.


Review by fluid-cloudnative/fluid agent reviewer

Comment thread pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go
…partial overlap test

Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>
@sonarqubecloud

Copy link
Copy Markdown

@Sukuna0007Abhi Sukuna0007Abhi requested a review from cheyang June 14, 2026 07:53

@TrafalgarZZZ TrafalgarZZZ left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@TrafalgarZZZ

Copy link
Copy Markdown
Member

@Sukuna0007Abhi Thanks for fixing this! There's a lint failure in current branch. Could you help also fix that? And we can merge it.

@cheyang cheyang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@fluid-e2e-bot

fluid-e2e-bot Bot commented Jun 15, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheyang, TrafalgarZZZ

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [TrafalgarZZZ,cheyang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cheyang cheyang merged commit 4601594 into fluid-cloudnative:master Jun 15, 2026
25 of 28 checks passed
@Sukuna0007Abhi

Copy link
Copy Markdown
Contributor Author

Thank you @cheyang @TrafalgarZZZ

@Sukuna0007Abhi Thanks for fixing this! There's a lint failure in current branch. Could you help also fix that? And we can merge it.

Sorry I was bit offline in that time, but it's merged now thanks, I think the lint got fixed or if not, I will raise another pr,

Thanks again @cheyang @TrafalgarZZZ

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] DatasetUsageInjector overwrites datasets in multi-round FUSE injection

4 participants